Abstract

Deep-space manned missions will require advanced 
automated health assessment capabilities. Requirements
such as in-space assembly, long dormant periods and limited
accessibility during flight present significant challenges that
should be addressed through Integrated System Health
Management (ISHM). The ISHM approach will provide 
safety and reliability coverage for a complete system over its 
entire life cycle by determining and integrating health status 
and performance information from the subsystem and 
component levels. This paper will focus on the potential 
advanced diagnostic elements that will provide intelligent 
assessment of the subsystem health and the planned 
implementation of these elements in the ISHM Testbed and 
Prototypes (ITP) Project under the NASA Exploration 
Systems Research and Technology program. 

I. Introduction 

Long-duration space exploration missions can only be 
accomplished with “systems-of-systems” that are robust, 
autonomous, and prepared to work in harsh and unforgiving
environments. In-space assembly, long dormant periods and
limited accessibility during flight are system requirements 
and constraints that pose risks to mission success. These 
risks present significant challenges and give rise to 
fundamental questions. What information is required for 
safe and sustainable operation, and how can it be 
determined? What information must be transferred to 
external facilities, possibly on-ground, for further analysis, 
and what situations require immediate autonomous 
response? How will systems be certified for operations in 
space after assembly? What tests must be conducted and 
what parameters must be monitored prior to system 
operation and throughout any dormant periods? Health 
Management (HM) technologies create the necessary 
foundation for success in such missions by providing 
answers to these questions. 

In order to provide this foundation, an Integrated System 
Health Management (ISHM) system must be implemented. 
The ISHM system will provide safety and reliability 
coverage for a complete system over its entire life cycle, 
integrating health status and performance information from 
the subsystem and component levels to arrive at a system- 
level conclusion. The ISHM system will involve a collection 
of processing algorithms and intelligent elements at the
subsystem and system levels that will analyze the available
data and report on the current status.

This paper will outline such a future ISHM approach by 
surveying the potential advanced diagnostic elements that 
will provide intelligent assessment of the system health and 
the planned implementation of these elements in the ISHM 
Testbed and Prototypes (ITP) Project under the NASA 
Exploration Systems Research and Technology program. 
First, this paper will briefly describe the basic diagnostic
approaches, highlighting their capabilities, requirements and 
limitations. The paper will then provide detailed examples 
of recent implementations in space-based systems and the 
basic HM implementation issues that must be addressed. 

II. Potential ISHM Intelligent Elements 

While the ISHM system will involve new technologies in 
hardware as well as software, we will focus our attention on 
the intelligent elements only. Hardware advances in areas
such as sensors, communications, and processing will
impact development in software and vice versa. From a
perspective of intelligence or autonomy, the ISHM system 
shall provide the following functions: 

• System Monitoring 

• Data Qualification 

• Feature/Information Extraction 

• Classification/Isolation/Diagnosis 

• Mission Projection/Prognosis 

• Communication/Information Transfer 

• System Recovery/Response 

Many, if not all, of these functions will require intelligent 
software elements in order to satisfy the anticipated system- 
level requirements of safety, reliability and sustainability 
that the NASA Exploration Systems Mission Directorate 
will impose. This paper will focus on potential elements 
required for the first five functions: system monitoring, data 
qualification, information extraction, diagnosis, and 
prognosis. 

A. Diagnostic Approaches 

Many diagnostic techniques have been developed and
applied to space systems over the last twenty years. For the 
sake of discussion here, these approaches will be 
categorized as either Model-Based or Empirical in nature 
(ref. 1), keeping in mind that certain hybrid techniques will 
blur this distinction. The following definitions will be 
applied: 

Model-Based — First principle relationships are used to 
define the response of a system. Model-Based
Diagnostic systems utilize a simulation of the
monitored system that can be either qualitative or
quantitative, meaning the relationships in the model
can be symbolic or numerical. The
diagnostic solution is an analysis relating the actual
system to the simulation. Some examples of Model-
Based Diagnostic systems are listed in table 1.

Empirical — Empirically-derived features or relation- 
ship information are used as indications of system 
response and state. This information can be derived
from expert knowledge acquisition or from
statistical data analysis. An Empirical Diagnostic
system utilizes this information to justify the
diagnostic solution. Several prominent classes of
Empirical Diagnostic systems are shown in
table 1.

TABLE 1.— CLASSIFICATION OF SYSTEMS 



Diagnostic Technology

Model-based                        Empirical
---------------------------------  ---------------------------
Constraint-based                   Statistical
  • Livingstone, MEXEC, State        • BEAM, SPRT, PCA
    Diagnosis, MARPLE, TEAMS         • Neural Networks
Bayesian                           Rule-based
  • Kalman Filters                   • SHINE, Mycin
  • Particle Filters               Bayesian
                                     • Hidden Markov Models

One important class of empirical approaches,
which can be used for feature extraction, fault detection, and
diagnosis, is based on data mining. Data mining seeks
to discover previously unknown regularities or anomalies in 
large data sets. There are a number of commonly used 
techniques in data mining, including: 

• Basic Statistics: Statistical techniques are used to 
summarize, consolidate and generalize the information 
in large sets of data. 

• Artificial neural networks: Non-linear predictive 
models that learn through training and resemble 
biological neural networks in structure (ref. 2). 

• Decision trees: Tree-shaped structures that represent
sets of decisions. These decisions generate rules for
the classification of a dataset. (Examples:
Classification and Regression Trees (CART) (ref. 3) and
C4.5 (ref. 4)).

• Genetic algorithms: Optimization techniques that use 
processes such as genetic combination, mutation, and 
natural selection in a design based on the concepts of 
evolution. 

• Nearest neighbor method: A technique that classifies
each record in a dataset based on the classes of its
nearest neighbors in the feature space. (Example: Orca
(ref. 5)).

• Rule induction: The extraction of useful if-then rules 
from data based on statistical significance (Example: 
C4.5 rules (ref. 4)). 

Data mining applies these techniques to large data sets, and 
often combines them with relational database and on-line 
analytic processing (OLAP) technologies. 
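
As an illustration of the nearest neighbor method listed above, the following is a minimal sketch rather than any of the cited tools. The function name, the two-feature records and the labels are hypothetical, and `math.dist` assumes Python 3.8 or later.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training records.

    `train` is a list of (feature_vector, label) pairs (illustrative format).
    """
    # Order the training records by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda rec: math.dist(rec[0], query))
    # Majority vote over the labels of the k closest records.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical records, e.g. normalized pressure and temperature features.
training = [
    ((0.10, 0.20), "nominal"),
    ((0.15, 0.25), "nominal"),
    ((0.20, 0.15), "nominal"),
    ((0.90, 0.80), "fault"),
    ((0.85, 0.95), "fault"),
    ((0.95, 0.90), "fault"),
]

print(knn_classify(training, (0.12, 0.22)))
print(knn_classify(training, (0.88, 0.85)))
```

In this toy data set the first query lies among the nominal records and the second among the fault records, so the vote is unanimous in both cases. With real sensor data the choice of k and the feature scaling would need to be tuned.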

In realistic applications, a simulation or model often relies 
upon prior process history in order to establish the 
relationships. General relationships may be generic for a 
component, but the actual simulation is often anchored by 
test data. For this reason, a working distinction between the
two categories in table 1 is that a Model-Based
approach uses physics to add structure to the
diagnostic system, and in this way seeks to ensure completeness
and to provide coverage of unanticipated failures. On the
other hand, Empirical techniques provide diagnostic 
coverage for systems where the explicit relationships are not 
known or the information from the system is sparse and not 
well distributed for diagnostic purposes. 

The requirements of the diagnostic system will influence 
the selection of the diagnostic technique. The diagnostic 
system may be a hybrid system containing elements from 
both categories. In addition, it may be a system that evolves 
from one category to another. For example, due to the sparse 
availability of data, the initial diagnostic system may be a 
simple collection of heuristic rules, and may evolve into a 
model of relationships as more information is gathered. 

Recent applications/developments . — State-of-the-art studies 
in the area of diagnostics and prognostics have been 
conducted several times recently (refs. 6 to 8). Using these 
studies as a guide, specific technologies have
been applied in the space-based vehicle arena and will be
initial candidates in the ITP project. The following sections
highlight four recent implementation areas; the lessons
learned from them will be used in the development of the
ITP health management system.

BEAM/SHINE: The NASA-developed Beacon-based
Exception Analysis for Multimissions (BEAM) is an
example of a hybrid technique with a strong empirical 
component. BEAM is a comprehensive self-analysis tool 
suitable as a monitor in many systems. BEAM seeks to 
mimic the logic of a human operator, and draws its training 
from many of the same sources. The fault detection, 
isolation, and prognostic conclusions are based upon 
physical models of arbitrary fidelity, symbolic models, 
example nominal data, real-time data, and architectural 
information such as connectivity and causal diagrams. 
BEAM detects anomalies by computing dynamical 
invariants (i.e., coefficients of an auto-regressive model) and 
comparing them to expected values as extracted from 
previous data. BEAM is particularly sensitive to anomalies 
caused by faulty sensors, subtle and sudden performance 
shifts, and unexpected transients, and can discriminate
between these different event types. The algorithms have
also been scaled to operate reliably on current-generation 
flight processors. 
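
The idea of comparing auto-regressive coefficients against expected values can be sketched in a few lines. This is not the BEAM implementation: it assumes a first-order model, a single signal, synthetic data and an arbitrary tolerance, and every function name is hypothetical.

```python
import random

def ar1_coefficient(signal):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    num = sum(signal[t] * signal[t - 1] for t in range(1, len(signal)))
    den = sum(signal[t - 1] ** 2 for t in range(1, len(signal)))
    return num / den

def simulate_ar1(phi, n, seed):
    """Generate a synthetic AR(1) series (a stand-in for real sensor data)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def is_anomalous(window, expected_phi, tolerance=0.2):
    """Flag a window whose estimated coefficient drifts from the expected one."""
    return abs(ar1_coefficient(window) - expected_phi) > tolerance

# The expected value is extracted from previous (nominal) data.
nominal_history = simulate_ar1(phi=0.8, n=2000, seed=1)
expected = ar1_coefficient(nominal_history)

healthy_window = simulate_ar1(phi=0.8, n=500, seed=2)
shifted_window = simulate_ar1(phi=0.2, n=500, seed=3)  # changed dynamics

print(is_anomalous(healthy_window, expected))
print(is_anomalous(shifted_window, expected))
```

The coefficient acts as an invariant of the signal dynamics: it stays near its expected value while the process behaves as before, and moves when the dynamics change even if the signal amplitude remains within limits.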

BEAM has been applied to many different domains, 
including propulsion system fault detection, deep space 
radio antenna automation, hydraulic system condition-based 
maintenance, and spacecraft attitude and articulation control 
subsystem anomaly detection. It has been demonstrated on 
numerous military/civilian space and aircraft systems 
(refs. 9 and 10). 

A recent NASA demonstration of BEAM is the Space 
Shuttle Main Engine (SSME) anomaly detection system 
developed in a joint effort between JPL and the Marshall 
Space Flight Center (MSFC). It was developed as a 
prototype of an automated tool for rapid analysis of SSME 
data. As such, BEAM automatically indicates specific time 
periods, signals, and features contributing to each anomaly. 
For the SSME application, a custom version of BEAM was 
built to analyze data gathered during ground tests. BEAM 
was used to detect anomalies in seven different test data sets 
that contained some of the most commonly encountered 
anomalies in SSME testing. Overall, BEAM was sensitive to 
all of the major anomalies in the seven anomalous data sets 
and detected the shift in the data characteristics (ref. 11). 

Propulsion IVHM Technology Experiment (PITEX): The 
Propulsion IVHM (Integrated Vehicle Health Management) 
Technology Experiment (PITEX) was a subsystem health 
management demonstration performed under the Space 
Launch Initiative (SLI) Program. The PITEX objective was 
to mature and demonstrate key IVHM technologies on a 
relevant 2nd Generation Reusable Launch Vehicle (RLV) 
propulsion system. The PITEX demonstration was originally 
selected to fly on the X-34 RLV developed by Orbital 
Sciences Corporation under an earlier program. Although 
the X-34 program was cancelled, PITEX carried forward the 
previous research by building upon an existing prototype
diagnostic system (refs. 12 to 14).

PITEX was a complete health assessment package, 
containing both the data processing algorithms and an 
intelligent element, Livingstone (fig. 1). The Livingstone 
module is a model-based diagnostic engine that processes 
qualitative constraints of the monitored system and 
compares the anticipated output with the actual sensor 
information. If a discrepancy is found, the Livingstone 
engine attempts to determine conditions within the various
components that would be consistent with the current system state. The
processing algorithms of PITEX, the Monitors and the Real- 
Time Interface, are tasked with providing system 
information in a timely and reliable fashion. 
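
The compare-then-search behavior described above can be illustrated with a toy consistency-based diagnosis. This is a sketch and not Livingstone itself: the two-component system, its mode names and its qualitative values are all hypothetical.

```python
from itertools import product

# Hypothetical component modes; "ok" is nominal, the others are failure modes.
MODES = {
    "valve": ["ok", "stuck_closed"],
    "sensor": ["ok", "stuck_low"],
}

def predict_reading(command, modes):
    """Qualitative model: propagate the valve command to the sensor reading."""
    flow = "high" if command == "open" and modes["valve"] == "ok" else "low"
    return flow if modes["sensor"] == "ok" else "low"

def diagnose(command, observed):
    """Return mode assignments consistent with the observation, fewest faults first."""
    names = list(MODES)
    candidates = []
    for combo in product(*(MODES[n] for n in names)):
        modes = dict(zip(names, combo))
        # Keep only assignments whose predicted output matches the sensor.
        if predict_reading(command, modes) == observed:
            candidates.append(modes)
    return sorted(candidates, key=lambda m: sum(v != "ok" for v in m.values()))

print(diagnose("open", "high"))  # only the all-nominal assignment is consistent
print(diagnose("open", "low"))   # several fault hypotheses explain the reading
```

When the reading matches the prediction, the only consistent assignment is the nominal one. When it does not, the search returns every assignment that explains the discrepancy, and in this example two single-fault candidates remain, which shows how a purely qualitative model can leave an ambiguity. A practical engine would use a guided search rather than exhaustive enumeration.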

Lack of experimental data and the need to demonstrate 
robustness to system variations required the extensive use of 
simulations to characterize nominal and failure conditions. 
The PITEX demonstration used flight-like data (noise, 
sensor resolution, and hardware uncertainties) and realistic 
nominal and failure modes, supported real-time operation, 
and addressed computer resource management issues.

[Figure 1 block labels: Flight Software, Telemetry]

Figure 1. — PITEX demonstration architecture. 


PITEX demonstrated the potential of Livingstone’s 
qualitative model-based diagnostic engine in propulsion 
system health assessment. Lessons learned in the 
development of the PITEX demonstration software, as well 
as the demonstration itself (simulation development, test 
metrics and testing suites) are a valuable resource to the ITP 
project. 

X-37 IVHM Experiment: The X-37 IVHM Experiment
was a complementary effort to PITEX in several ways. The
goals were similar: to integrate and mature IVHM
technology and demonstrate it in flight. The experiment
integrated Livingstone 2 with the X-37’s flight software,
which involved developing monitors and a real-time
interface. The X-37 IVHM Experiment modeled the X-37’s
electrical power system and electro-mechanical actuators 
instead of the propulsion system, and while PITEX was 
planned to have its own computer onboard the X-34, the 
X-37 IVHM Experiment was designed to run on the same 
flight computer as the X-37’s critical flight software. Both 
of these differences provided important opportunities to 
mature the HM technologies, expanding into different 
critical spacecraft systems and flight-qualified computing 
resources. The decision to run it on the flight computer 
resulted in safety requirements to ensure that the IVHM 
software could not interfere with critical flight software. It 
also resulted in very tight CPU and memory resource usage 
requirements. Although this experiment did not reach the flight
testing phase, it was a valuable demonstration of IVHM
capabilities meeting these stringent requirements (ref. 15). 

BEAM-Livingstone Integration: BEAM-Livingstone
Integration was an effort to create a prototype hybrid
reasoning system utilizing the strengths of Jet Propulsion
Laboratory’s (JPL) BEAM and Ames Research Center’s
(ARC) Livingstone technologies under NASA’s Space
Launch Initiative (SLI) program. The effort demonstrated
the feasibility of integrating BEAM, a continuous domain 
feature-based detector, and Livingstone, a discrete domain
model-based reasoner, to create a hybrid diagnostic system.
The hybrid diagnostic system was validated on one scenario 
from the PITEX simulation of the X-34 main propulsion 
feed system. In the scenario, Livingstone could not 
distinguish between a double regulator failure and a pressure 
sensor failure. Integrating BEAM into Livingstone as a virtual
sensor provided an independent source of evidence,
and the hybrid system was able to diagnose the
correct failure without ambiguity. The results of the
hybrid reasoner demonstrated the synergistic benefits of 
integrating BEAM and Livingstone (ref. 16). 
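
The role of a virtual sensor in resolving an ambiguous diagnosis can be sketched as a simple filtering step. The signature table below is an assumption made for illustration only; it does not reproduce the evidence actually used in the study (ref. 16), and the function and label names are hypothetical.

```python
# Hypothetical signature that each fault would leave on a continuous-domain
# anomaly detector used as a virtual sensor (assumed values, illustration only).
VIRTUAL_SENSOR_SIGNATURE = {
    "double_regulator_failure": "multi_signal_anomaly",
    "pressure_sensor_failure": "single_signal_anomaly",
}

def refine_diagnosis(candidates, virtual_sensor_reading):
    """Keep only the candidates consistent with the virtual sensor reading."""
    consistent = [
        fault for fault in candidates
        if VIRTUAL_SENSOR_SIGNATURE.get(fault) == virtual_sensor_reading
    ]
    # If the new evidence contradicts every candidate, leave the set unchanged
    # rather than discard the model-based result.
    return consistent or list(candidates)

# Candidate set that the discrete model alone cannot separate.
ambiguous = ["double_regulator_failure", "pressure_sensor_failure"]
print(refine_diagnosis(ambiguous, "single_signal_anomaly"))
```

The design choice worth noting is that the detector output is treated as one more observation for the reasoner, so the discrete model needs no knowledge of how the continuous analysis was performed.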


III. Implementation Issues 

There are several implementation issues that need to be 
defined and addressed. Each issue will impact not only the 
type of diagnostic technique applied, but also the 
effectiveness of the HM application. Figure 2 provides a 
graphical view of the issues involved in HM system design. 
Each issue group will be discussed in greater detail. 

A. Monitored System Design 

First, the system or subsystem to be monitored must be 
defined. Issues such as which measurements are available and
which failures can be detected need to be identified. The next
step would be the characterization of the “healthy” system, 
as well as these identified failures. This may be done 
through historical assessment of the system using expert 
domain knowledge, system modeling and data mining
technologies. Diagnostic techniques depend critically on the 
behaviors and modes of the system being managed and 
require in-depth knowledge of nominal operations and 
failure modes. To support both development and verification 
and validation, the system behaviors and modes must be 
modeled in a simulation of sufficient fidelity to properly 
reflect system responses to events.

Along with the ability to detect the fault condition, the
ability to isolate failures needs to be evaluated. At a 
minimum, failures only require isolation when the 
remediation strategy differs between the faults. It should be 
noted that the remediation strategy for the same failure may 
vary greatly depending on the mission objectives, mission 
phase and current system configuration. Once the feature 
processing and diagnostic requirements are established, the 
proper health management techniques can be selected and 
developed. 

The success of any diagnostic system is dependent on the 
diagnostic algorithms’ awareness of the system behaviors 
and modes. The definition of sensor types and placement 
can be derived directly from the behavior and mode models 
developed in knowledge capture. Sensor selection and type 
will also need to be commensurate with the application 
environment of the space-based system. Therefore, sensor
selection will be based on the diagnostic benefit provided, 
weighed against the cost. For future system designs, each 
component, including sensors and software, will need to 
“buy” its way on-board. 

B. HM Processing Requirements 

The processing constraints for the intelligent elements 
must also be determined. In order to provide the signal- 
processing and diagnostic analysis, each element will 
require system resources in the form of CPU and memory. 
There is also communication bandwidth and information 
storage to consider. The required response time for each 
monitored failure must be considered. Failure manifestation 
can be on the order of milliseconds to days. How quickly the 
diagnostic system is required to detect and resolve the 
failure will drive the applicable technique and the system 
resource load required by that technique. Finally, every HM 
element will require resources for development and 
validation. These resources will generally take the form of 
historical information; the amount of information required 
and how the information will be utilized will depend on the 
selected technique. 

C. HM Performance Requirements 

Another set of issues to consider is the performance 
requirements for the HM elements and the verification and 
validation process. HM elements should demonstrate the 
ability to scale and evolve as the monitored system matures 
in its development. These elements should also demonstrate 
robustness to common sources of system uncertainty, such 
as sensor signal noise, build-to-build variation and 
environmental condition changes. The ability to handle 
system uncertainty should not require the HM element to 
become insensitive to failure detection. This is a common 
trade-off between competing HM requirements (false alarms 
versus missed detections) that needs to be addressed. 
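
The trade-off between false alarms and missed detections can be made concrete by sweeping a detection threshold over scored data. The scores below are hypothetical, and the sketch assumes that a higher score means a more anomalous sample.

```python
def rates(nominal_scores, fault_scores, threshold):
    """Return (false_alarm_rate, missed_detection_rate) for one threshold."""
    # A nominal sample scoring above the threshold raises a false alarm.
    false_alarms = sum(s > threshold for s in nominal_scores)
    # A faulty sample scoring at or below the threshold is missed.
    missed = sum(s <= threshold for s in fault_scores)
    return false_alarms / len(nominal_scores), missed / len(fault_scores)

# Hypothetical anomaly scores for labeled nominal and faulty samples.
nominal = [0.1, 0.2, 0.3, 0.4, 0.6]
faulty = [0.5, 0.7, 0.8, 0.9, 1.0]

for threshold in (0.25, 0.45, 0.65):
    fa, md = rates(nominal, faulty, threshold)
    print(f"threshold={threshold:.2f} false_alarm={fa:.1f} missed={md:.1f}")
```

With these scores the lowest threshold gives a false alarm rate of 0.6 with no missed detections, while the highest removes the false alarms but misses one of the five faults. Because the two score populations overlap, no threshold drives both rates to zero, which is the tension the requirement has to resolve.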


D. HM Interface Requirements 

Finally, the interface requirements for each element must 
be defined. Each element will need to determine the health 
assessment information that will be transferred externally. 
The element may also require input from other elements, 
such as an understanding of current system state/operation. 
How the element will behave if this external information is
unavailable or corrupted must be considered. These are
important ISHM design issues that could severely impact
the performance of the HM system.

The ISHM system design process should be conducted in 
tandem with the overall vehicle design and development. 
The ISHM requirements in addressing these issues could 
push back on the overall vehicle or system design and 
development. 

IV. Verification and Validation Techniques

Verification and validation (V&V) of diagnostic 
algorithms is a challenging and critical phase of ISHM 
development. The V&V of these algorithms requires not 
only operational systems, but the ability to test systems to 
failure to demonstrate the ability of the diagnostic 
algorithms to identify nominal conditions and failure 
conditions. This can be accomplished through an integrated 
series of software simulations, hardware-in-the-loop 
simulations, and hardware testing. Each phase provides 
differing abilities to exercise failure modes and demonstrate 
diagnostic performance. 

A. Simulation Definition and Development 

System simulations can be developed directly from the
system behaviors and modes incorporating defined sensor 
responses. These simulations should provide an accurate, 
physics-based model of the system. These simulations 
incorporate nominal operating conditions as well as failure 
modes to exercise diagnostics. Execution of simulated 
systems will need to account for realistic system conditions, 
multi-processing requirements, communications and data 
buses (data processing and throughput), latencies, 
environment effects, real-time sensor and data fusion, 
system hardware characteristics and interfaces, supplied 
system behavior models and interfaces, computing and 
storage resources, and fault insertion scenarios. 
Furthermore, the simulation facility shall provide the
following features: software performance monitoring, sensor 
health monitoring, data integrity, fault detection and quick 
assertion and analysis, and identification of false positive 
conditions. By incorporating the diagnostic algorithms into 
the software simulations, a complete investigation of the 
algorithms can be conducted.

B. Simulation Execution (S/W)

Software simulations of the system being monitored and 
the diagnostic system provide a rich environment in which 
to evaluate, verify, and validate diagnostic performance. 
Failure modes, which may not be feasibly executed in 
hardware, can be exercised in the software simulation. In 
addition, the software system can be run faster than real 
time, allowing many more cases to be verified and validated 
than when using a hardware system. This does not replace 
hardware testing, but allows a level of investigation into 
scenarios and state spaces not otherwise available. 
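
A software-only fault insertion sweep might look like the following minimal sketch. The tank model, the leak rate and the limit-check monitor are hypothetical stand-ins for a real system simulation and diagnostic algorithm.

```python
def simulate_tank(steps, leak_start=None, leak_rate=0.5):
    """Discrete-time tank pressure; an optional leak is inserted at `leak_start`."""
    pressure, trace = 100.0, []
    for t in range(steps):
        if leak_start is not None and t >= leak_start:
            pressure -= leak_rate  # inserted fault: steady pressure loss
        trace.append(pressure)
    return trace

def detect(trace, limit=95.0):
    """Limit-check monitor: first step at which pressure falls below `limit`."""
    for t, p in enumerate(trace):
        if p < limit:
            return t
    return None

# Sweep many fault insertion scenarios. Simulated time is decoupled from
# wall-clock time, so the whole sweep runs far faster than real time.
results = {start: detect(simulate_tank(200, leak_start=start))
           for start in range(0, 150, 10)}
latencies = [found - start for start, found in results.items()]

print(detect(simulate_tank(200)))      # nominal run: no detection expected
print(min(latencies), max(latencies))  # detection latency across the sweep
```

The same harness can report detection latency, missed detections and false positives for each scenario, which gives the coverage picture that is difficult to obtain from hardware alone.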

C. Hardware-In-The-Loop Simulation 

Hardware-in-the-loop simulation would further define and 
qualify the underlying hardware and software diagnostic 
technologies. In this case, flight or prototype hardware 
executes the diagnosis software interactively with the 
software simulation of the system being monitored. System 
hardware can be added in this case to test interaction of the 
diagnostic systems with hardware components. With this
capability, integrated system performance testing, integrated
procedure testing, operations and maintenance requirements 
development as well as operator training and familiarization 
could be performed. The capability to perform real-time, 
hardware-in-the-loop simulation of space vehicles and their 
subsystems will better provide for realistic system fault 
scenarios and performance assessment. Extensive resources 
for data processing, data archival, and hosting special 
configuration requirements will be needed. This testing
cannot be performed faster than real time because the
diagnostic system executes on actual hardware.
However, failure modes that are not feasible in a full hardware
test can still be evaluated. These include faults that are difficult
or hazardous to induce, and faults that require extended state
space testing that is impractical with hardware systems because
of limited component life, testing time frames and costs. In addition,
performance of the diagnostic system can be validated by 
incorporating the diagnostic hardware into the simulation. 
Results of this testing can be fed back to the software 
simulation of the diagnostic system to improve the 
diagnostic system behavior modeling. Hardware-in-the-loop 
testing can also be used to identify scenarios which need full 
hardware testing, thus defining the hardware test program. 

D. Hardware System Test 

An integrated hardware system test environment is 
necessary to perform complete diagnostic system testing. 
Based on results of the software simulation and hardware- 
in-the-loop testing, full hardware tests will be conducted to
verify and validate the diagnostic system performance in nominal
operational scenarios as well as selected failure modes.
These tests also provide the best medium to verify and validate overall
system performance. Feedback from this testing can be
provided to improve the software simulations used for
development and hardware-in-the-loop testing. 

V. ITP Intelligent Elements 

For the ITP Project, there are two distinct areas where 
multiple intelligent elements will be applied. The first area 
will be the insertion of real-time health assessment 
capabilities within a rocket component test facility at NASA 
Stennis Space Center (SSC). This activity will involve the 
development and integration of advanced hardware and 
software elements within a controlled, relevant environment 
in order to assess the health management benefits and 
capabilities. The second area within the ITP Project involves 
using the International Space Station (ISS) as a testbed for 
ISHM software. It will include the use of historical ISS data, 
simulated ISS data, and possibly near-real-time ISS data to 
validate the performance of a variety of ISHM algorithms. 

For the ITP project, diagnostic intelligent elements will be 
specifically developed for the E2 rocket component test 
facility at NASA SSC. Information processing will be 
performed on historical data from the testing facility, using 
conventional signal processing algorithms (legacy algorithms
from earlier propulsion health assessment projects, including 
PITEX) and the BEAM processing and classification 
capabilities. The resulting detection and diagnostic elements 
will be incorporated within a G2 software framework of the 
monitored system. In Phase 1, a focused demonstration 
system will be developed to highlight the potential HM 
capabilities. In Phase 2, the detection and diagnostic 
elements will be expanded in fault coverage, and more 
sophisticated techniques and elements will be incorporated 
in the hybrid system. 

In the ITP project, two data mining approaches will be 
explored to improve fault detection, diagnosis, and failure 
prognosis. In Phase 1, historical data from the SSC test 
stand will be analyzed using unsupervised anomaly 
detection algorithms, which use only nominal training data
to generate a model of nominal sensor data. At run time,
the algorithm signals an anomaly when the sensor data no
longer fit the model. In Phase 2, supervised anomaly
detection will be added, which learns to distinguish nominal 
and off-nominal patterns based on past examples of both, 
and can be used for diagnosis by learning to distinguish 
among examples of different types of faults. These 
algorithms will also be applied to data from the International 
Space Station during Phase 2. A variety of data mining 
algorithms that have been proven in other applications will 
be applied during this project, such as Orca, an unsupervised 
anomaly detection algorithm previously used for Earth 
science and aviation security applications (ref. 5), and C5.0, 
a supervised decision tree induction system from Rulequest 
Research. 
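
The unsupervised case, in which only nominal data are available for training, can be sketched with a distance-based nominal model. This is not Orca or any of the algorithms named above: the threshold rule, the margin and the sample values are assumptions chosen for illustration, and `math.dist` assumes Python 3.8 or later.

```python
import math

def fit_nominal_model(nominal_samples):
    """Build a model from nominal data only.

    The threshold is the largest nearest-neighbor distance observed within
    the nominal training set (an assumed, simple rule).
    """
    gaps = []
    for i, a in enumerate(nominal_samples):
        gaps.append(min(math.dist(a, b)
                        for j, b in enumerate(nominal_samples) if j != i))
    return {"samples": list(nominal_samples), "threshold": max(gaps)}

def is_anomaly(model, sample, margin=1.5):
    """Signal an anomaly when the sample lies far from all nominal data."""
    nearest = min(math.dist(sample, s) for s in model["samples"])
    return nearest > margin * model["threshold"]

# Hypothetical nominal sensor vectors (two normalized channels).
nominal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0), (1.0, 1.2)]
model = fit_nominal_model(nominal)

print(is_anomaly(model, (1.05, 1.05)))  # close to the nominal cluster
print(is_anomaly(model, (2.0, 2.0)))    # far from every nominal sample
```

No fault examples are needed to build the model, which suits a test stand where nominal runs are plentiful and failures are rare. The supervised variant would instead require labeled examples of each fault type.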

In both of these activity areas, implementation and V&V 
issues will need to be addressed throughout Phases 1 and 2.


NASA/TM-2005-213849



The architectures for both of these research projects will 
need to incorporate individual elements into a unified hybrid 
diagnostic system that allows for expansion and evolution. 
In addition, simulation and testing capabilities need to be 
established that will enable the proper evaluation of the final 
ISHM product. 

VI. Summary 

For the ITP project, we have begun to identify and define 
the applied system and develop the requirements for the 
ISHM elements. Selection of HM techniques for this project 
will be based on the lessons learned in recent applications of 
BEAM and Livingstone, as well as experience in data 
mining activities. We have outlined the potential design, 
development and testing issues that must be addressed: 

• System definition and characterization 

• Simulation development 

• HM requirements definition 

• HM element selection and development 

• ISHM Evaluation Testing
  o Simulation
  o Hardware-In-The-Loop Simulation
  o Hardware System

We also discussed the specific research areas within the ITP 
project from the intelligent element perspective. Along with 
the HM requirements, implementation and development 
issues will be anticipated and resolved as part of the 
research in each element development process.